Abstract. A standard theoretical paradigm for the formation of large-scale
structure in the distribution of galaxies has now been established,
based on the gravitational instability of cold dark matter in a background 
cosmology dominated by vacuum energy. Significant uncertainties remain 
in the modelling of complex astrophysical processes involved in galaxy 
formation, perhaps most fundamentally in the relationship between the 
distributions of luminous galaxies and the underlying dark matter. I 
argue that the Square Kilometre Array is likely to provide information 
crucial to understanding this relationship and how it evolves with time. 



1. Introduction 

Over the last few years, cosmology has witnessed unprecedented improvements 
in our knowledge of the basic parameters governing the expansion of the Uni- 
verse originating with observations of high-redshift supernovae (Riess et al. 1998; 
Perlmutter et al. 1999) and culminating in the recently-released data from the 
WMAP satellite (Spergel et al. 2003). As a result of these developments and 
parallel advances in theory, the field of large-scale structure has now entered a 
period of transition. Before the onset of the current data explosion, there were 
two basic reasons for wanting to study galaxy clustering. One was that it might 
furnish observational ways of pinning down cosmological parameters, and the 
other was that it provided the context within which to study galaxy formation 
and evolution. These two approaches are not mutually exclusive, of course, but
one might associate the first with particle-cosmologists or inflationary
specialists, whereas the second is more likely to come from cosmologists of a
more astrophysical persuasion. Both points of view have stimulated the development of
this field over the past twenty years. Now things are changing. Given the appar- 
ent precision with which we now know the cosmological parameters, observations 
of galaxy clustering will at most be seen as consistency checks on the fundamen- 
tal properties of the Universe. On the other hand, accelerating improvements 
of observational technology have opened up the possibility of probing the very 
detailed and subtle properties of galaxies that are regarded as a nuisance to 
those interested in fundamental parameters. 

In this paper, I will argue that the strongest contribution to be made
by the Square Kilometre Array (SKA), given the timescale required for its
completion, is likely to be not in the pristine world of particle cosmology but
in the grubby astrophysics of galaxy formation. I start by giving a very brief
overview of structure formation theory for non-specialists and then try to draw
out some of the areas in which the character of the subject is changing. I will
then discuss briefly the merits of the 21cm galaxy surveys described in the
science case for SKA, which can be found at:

http://www.skatelescope.org/ska_science.shtml



2. Basics of Cosmological Structure Formation 



2.1. Basic Framework 

The Big Bang theory is built upon the Cosmological Principle, which requires 
the Universe on large scales to be both homogeneous and isotropic. Space-times 
consistent with this requirement can be described by the Robertson-Walker 
metric 

$$ ds^2_{\rm RW} = c^2\,dt^2 - a^2(t)\left(\frac{dr^2}{1-\kappa r^2} + r^2\,d\theta^2 + r^2\sin^2\theta\,d\phi^2\right), \qquad (1) $$

where $\kappa$ is the spatial curvature, scaled so as to take the values 0 or $\pm 1$. The case
$\kappa = 0$ represents flat space sections, and the other two cases are space sections
of constant positive or negative curvature, respectively. The time coordinate
$t$ is called cosmological proper time and it is singled out as a preferred time
coordinate by the property of spatial homogeneity. The quantity $a(t)$, the cosmic
scale factor, describes the overall expansion of the universe as a function of time.
If light emitted at time $t_e$ is received by an observer at $t_0$ then the redshift $z$ of
the source is given by

$$ 1 + z = \frac{a(t_0)}{a(t_e)}. \qquad (2) $$

The dynamics of an FRW universe are determined by the Einstein gravitational
field equations, which become

$$ 3\left(\frac{\dot{a}}{a}\right)^2 = 8\pi G\rho - \frac{3\kappa c^2}{a^2} + \Lambda, \qquad (3) $$

$$ \frac{\ddot{a}}{a} = -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^2}\right) + \frac{\Lambda}{3}, \qquad (4) $$

$$ \dot{\rho} = -3\frac{\dot{a}}{a}\left(\rho + \frac{p}{c^2}\right). \qquad (5) $$

These equations determine the time evolution of the cosmic scale factor $a(t)$ (the
dots denote derivatives with respect to cosmological proper time $t$) and therefore
describe the global expansion or contraction of the universe. The behaviour of
these models can further be parametrised in terms of the Hubble parameter
$H = \dot{a}/a$ and the density parameter $\Omega = 8\pi G\rho/3H^2$, a suffix 0 representing the
value of these quantities at the present epoch when $t = t_0$. The cosmological
constant is denoted $\Lambda$ here, but it can be regarded instead as an additional energy
density, various forms of which have a similar effect; see Huterer & Turner (2001).
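
As a rough numerical illustration of how equation (3) fixes the expansion rate, the sketch below evaluates the Hubble parameter at redshift z for pressureless matter plus a cosmological constant, with the scale factor normalised to unity today. The function name and the default values (a present-day Hubble parameter of 70 km/s/Mpc, a matter density parameter of 0.3 and a cosmological-constant contribution of 0.7) are assumptions made for the example, not values taken from the text.

```python
import math


def hubble_rate(z, h0=70.0, omega_m=0.3, omega_lambda=0.7):
    """H(z) in km/s/Mpc from equation (3) for pressureless matter plus Lambda.

    omega_lambda stands for Lambda / (3 H_0^2); all defaults are illustrative.
    """
    a = 1.0 / (1.0 + z)  # equation (2) with the scale factor set to 1 today
    # The curvature term is whatever is needed to give H = H_0 at z = 0.
    omega_k = 1.0 - omega_m - omega_lambda
    e_squared = omega_m * a ** -3 + omega_k * a ** -2 + omega_lambda
    return h0 * math.sqrt(e_squared)
```

For a flat model containing only matter, the expression reduces to a Hubble parameter growing as the 3/2 power of (1 + z).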

2.2. Linear Theory

In order to understand how structures form we need to consider the difficult
problem of dealing with the evolution of inhomogeneities in the expanding
Universe. We are helped in this task by the fact that we expect such inhomogeneities
to be of very small amplitude early on so we can adopt a kind of perturbative
approach, at least for the early stages of the problem. If the length scale of the
perturbations is smaller than the effective cosmological horizon $d_H = c/H_0$, a
Newtonian treatment of the subject is expected to be valid. If the mean free path
of a particle is small, matter can be treated as an ideal fluid and the Newtonian
equations governing the motion of gravitating particles in an expanding universe
can be written in terms of $\mathbf{x} = \mathbf{r}/a$ (the comoving spatial coordinate, which is
fixed for observers moving with the Hubble expansion), $\mathbf{v} = \dot{\mathbf{r}} - H\mathbf{r} = a\dot{\mathbf{x}}$ (the
peculiar velocity field, representing departures of the matter motion from pure
Hubble expansion), $\phi(\mathbf{x}, t)$ (the peculiar Newtonian gravitational potential, i.e.
the fluctuations in potential with respect to the homogeneous background) and
$\rho(\mathbf{x}, t)$ (the matter density). Using these variables we obtain, first, the Euler
equation:

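$$ \frac{\partial (a\mathbf{v})}{\partial t} + (\mathbf{v}\cdot\nabla_x)\mathbf{v} = -\frac{1}{\rho}\nabla_x p - \nabla_x\phi. \qquad (6) $$

The second term on the right-hand side of equation (6) is the peculiar
gravitational force, which can be written in terms of $\mathbf{g} = -\nabla_x\phi/a$, the peculiar
gravitational acceleration of the fluid element. If the velocity flow is
irrotational, $\mathbf{v}$ can be rewritten in terms of a velocity potential $\phi_v$: $\mathbf{v} = -\nabla_x\phi_v/a$.
Next we have the continuity equation:

$$ \frac{\partial\rho}{\partial t} + 3H\rho + \frac{1}{a}\nabla_x\cdot(\rho\mathbf{v}) = 0, \qquad (7) $$

which expresses the conservation of matter, and finally the Poisson equation:

$$ \nabla_x^2\phi = 4\pi G a^2(\rho - \rho_0) = 4\pi G a^2\rho_0\delta, \qquad (8) $$

describing Newtonian gravity. Here $\rho_0$ is the mean background density, and

$$ \delta \equiv \frac{\rho - \rho_0}{\rho_0} \qquad (9) $$

is the density contrast.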

The next step is to linearise the Euler, continuity and Poisson equations by
perturbing physical quantities defined as functions of Eulerian coordinates, i.e.
relative to an unperturbed coordinate system. Expanding $\rho$, $\mathbf{v}$ and $\phi$
perturbatively and keeping only the first-order terms in equation (7) gives the
linearised continuity equation:

$$ \frac{\partial\delta}{\partial t} = -\frac{1}{a}\nabla_x\cdot\mathbf{v}, \qquad (10) $$

which can be inverted, with a suitable choice of boundary conditions, to yield

$$ \delta = -\frac{1}{aHf}\left(\nabla_x\cdot\mathbf{v}\right). \qquad (11) $$

The function $f \simeq \Omega_0^{0.6}$; this is simply a fitting formula to the full solution. The
linearised Euler and Poisson equations are

$$ \frac{\partial\mathbf{v}}{\partial t} + \frac{\dot{a}}{a}\mathbf{v} = -\frac{1}{\rho a}\nabla_x p - \frac{1}{a}\nabla_x\phi, \qquad (12) $$

$$ \nabla_x^2\phi = 4\pi G a^2\rho_0\delta; \qquad (13) $$

$|\mathbf{v}|, |\phi|, |\delta| \ll 1$ in equations (10), (12) & (13). From these equations, and if one
ignores pressure forces, it is easy to obtain an equation for the evolution of $\delta$:

$$ \ddot{\delta} + 2H\dot{\delta} - \frac{3}{2}\Omega H^2\delta = 0. \qquad (14) $$

For a spatially flat universe dominated by pressureless matter, $\rho_0(t) = 1/6\pi G t^2$
and equation (14) admits two linearly independent power law solutions
$\delta(\mathbf{x}, t) = D_\pm(t)\delta(\mathbf{x})$, where $D_+(t) \propto a(t) \propto t^{2/3}$ is the
growing mode and $D_-(t) \propto t^{-1}$ is the decaying mode.
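
The two power-law solutions can be checked numerically. The sketch below integrates equation (14) for a flat, matter-dominated background, for which H = 2/(3t) and the coefficient of the source term is 2/(3t^2), using a fixed-step fourth-order Runge-Kutta scheme. The function names, the time units (t = 1 at the start) and the initial conditions are assumptions chosen for the example.

```python
def growth_acceleration(t, delta, delta_dot):
    """Second time derivative of delta from equation (14) when Omega = 1.

    For this background H = 2/(3t) and 4*pi*G*rho_0 = 2/(3 t^2).
    """
    hubble = 2.0 / (3.0 * t)
    source = 2.0 / (3.0 * t * t)
    return -2.0 * hubble * delta_dot + source * delta


def integrate_growth(t_start=1.0, t_end=100.0, n_steps=10000,
                     delta0=1.0, delta_dot0=0.0):
    """Integrate equation (14) with a fixed-step fourth-order Runge-Kutta scheme."""
    dt = (t_end - t_start) / n_steps
    delta, delta_dot = delta0, delta_dot0
    for i in range(n_steps):
        t = t_start + i * dt
        k1d, k1v = delta_dot, growth_acceleration(t, delta, delta_dot)
        k2d = delta_dot + 0.5 * dt * k1v
        k2v = growth_acceleration(t + 0.5 * dt, delta + 0.5 * dt * k1d, k2d)
        k3d = delta_dot + 0.5 * dt * k2v
        k3v = growth_acceleration(t + 0.5 * dt, delta + 0.5 * dt * k2d, k3d)
        k4d = delta_dot + dt * k3v
        k4v = growth_acceleration(t + dt, delta + dt * k3d, k4d)
        delta += dt * (k1d + 2.0 * k2d + 2.0 * k3d + k4d) / 6.0
        delta_dot += dt * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return delta, delta_dot


def analytic_growth(t):
    """Exact solution for delta(1) = 1 and zero initial rate: a mix of both modes."""
    return 0.6 * t ** (2.0 / 3.0) + 0.4 / t
```

Starting from rest, 60 per cent of the initial amplitude goes into the growing mode. By the end of the integration the decaying mode contributes well under one per cent, and the logarithmic slope of the perturbation with respect to time approaches 2/3.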

2.3. Primordial density fluctuations 

The above considerations apply to the evolution of a single Fourier mode of the
density field $\delta(\mathbf{x}, t) = D_+(t)\delta(\mathbf{x})$. What is more likely to be relevant, however,
is the case of a superposition of waves, resulting from some kind of stochastic
process in which the density field consists of a superposition of such modes with
different amplitudes. A statistical description of the initial perturbations is
therefore required, and any comparison between theory and observations will
also have to be statistical.

The spatial Fourier transform of $\delta(\mathbf{x})$ is

$$ \tilde{\delta}(\mathbf{k}) = \frac{1}{(2\pi)^3}\int d^3x\, e^{-i\mathbf{k}\cdot\mathbf{x}}\,\delta(\mathbf{x}). \qquad (15) $$

It is useful to specify the properties of $\delta$ in terms of $\tilde{\delta}$. We can define the
power-spectrum of the field to be (essentially) the variance of the amplitudes at a given
value of $k$:

$$ \langle\tilde{\delta}(\mathbf{k}_1)\tilde{\delta}(\mathbf{k}_2)\rangle = P(k_1)\,\delta^D(\mathbf{k}_1 + \mathbf{k}_2), \qquad (16) $$

where $\delta^D$ is the Dirac delta function; this rather cumbersome definition takes
account of the translation symmetry and reality requirements for $P(\mathbf{k})$; isotropy
is expressed by $P(\mathbf{k}) = P(k)$. The analogous quantity in real space is called the
two-point correlation function or, more correctly, the autocovariance function,
of $\delta(\mathbf{x})$:

$$ \langle\delta(\mathbf{x}_1)\delta(\mathbf{x}_2)\rangle = \xi(|\mathbf{x}_1 - \mathbf{x}_2|) = \xi(\mathbf{r}) = \xi(r), \qquad (17) $$

which is itself related to the power spectrum via a Fourier transform. The power- 
spectrum is particularly important because it provides a complete statistical 
characterisation of a particular kind of stochastic process: a Gaussian random 
field. This class of field is the generic prediction of inflationary models, in which 
the density perturbations are generated by Gaussian quantum fluctuations in a 
scalar field during the inflationary epoch (e.g. Brandenberger 1985). 
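
The Fourier relation between the correlation function and the power spectrum can be made concrete. With the transform convention of equation (15), an isotropic spectrum gives a correlation function equal to 4 pi times the integral over k of k^2 P(k) sin(kr)/(kr). The sketch below evaluates that integral with Simpson's rule; the function names and the Gaussian test spectrum are assumptions introduced for the example, chosen because the answer is known in closed form.

```python
import math


def correlation_function(power, r, k_max, n_intervals=4000):
    """xi(r) = 4 pi * integral_0^k_max k^2 P(k) sin(kr)/(kr) dk (Simpson's rule).

    The prefactor follows from the Fourier convention of equation (15);
    `power` is any callable returning P(k), assumed negligible beyond k_max.
    """
    if n_intervals % 2:
        n_intervals += 1  # Simpson's rule needs an even number of intervals
    dk = k_max / n_intervals

    def integrand(k):
        kr = k * r
        window = 1.0 if kr < 1e-8 else math.sin(kr) / kr
        return k * k * power(k) * window

    total = integrand(0.0) + integrand(k_max)
    for i in range(1, n_intervals):
        total += (4.0 if i % 2 else 2.0) * integrand(i * dk)
    return 4.0 * math.pi * total * dk / 3.0


def gaussian_power(k, scale=1.0):
    """Toy spectrum whose transform is exactly xi(r) = exp(-r^2 / (2 scale^2))."""
    norm = (scale * scale / (2.0 * math.pi)) ** 1.5
    return norm * math.exp(-0.5 * (k * scale) ** 2)
```

Calling `correlation_function(gaussian_power, 1.0, k_max=12.0)` should reproduce exp(-1/2) to high accuracy, which is a convenient check on the normalisation.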

The shape of the initial fluctuation spectrum is assumed to be imprinted
on the universe at some arbitrarily early time. Many versions of the inflationary
scenario for the very early universe (Guth 1981) produce a power-law form

$$ P(k) = Ak^n, \qquad (18) $$

with a preference in some cases for the Harrison-Zel'dovich form with $n = 1$
(Harrison 1970; Zel'dovich 1972). Even if inflation is not the origin of density
fluctuations, the form (18) is a useful phenomenological model for the fluctuation 
spectrum. 

These considerations specify the shape of the fluctuation spectrum, but not 
its amplitude. The discovery of temperature fluctuations in the CMB (Smoot 
et al. 1992) plugged that gap. 

2.4. The transfer function 

We have hitherto assumed that the effects of pressure and other astrophysical
processes on the gravitational evolution of perturbations are negligible. In fact,
depending on the form of any dark matter, and the parameters of the back- 
ground cosmology, the growth of perturbations on particular length scales can 
be suppressed relative to the growth laws discussed above. 

We need first to specify the fluctuation mode. In cosmology, the two relevant 
alternatives are adiabatic and isocurvature. The former involve coupled fluctua- 
tions in the matter and radiation component in such a way that the entropy does 
not vary spatially; the latter have zero net fluctuation in the energy density and 
involve entropy fluctuations. Adiabatic fluctuations are the generic prediction 
from inflation and form the basis of most currently fashionable models, although 
interesting work has been done on isocurvature models (e.g. Peebles 1999). 

In the classical Jeans instability, pressure inhibits the growth of structure 
on scales smaller than the distance traversed by an acoustic wave during the 
free-fall collapse time of a perturbation. If there are collisionless particles of 
hot dark matter, they can travel rapidly through the background and this free 
streaming can damp away perturbations completely. Radiation and relativis- 
tic particles may also cause kinematic suppression of growth. The imperfect 
coupling of photons and baryons can also cause dissipation of perturbations 
in the baryonic component. The net effect of these processes, for the case of 
statistically homogeneous initial Gaussian fluctuations, is to change the shape 
of the original power-spectrum in a manner described by a simple function of 
wave-number - the transfer function $T(k)$ - which relates the processed
power-spectrum $P(k)$ to its primordial form $P_0(k)$ via $P(k) = P_0(k) \times T^2(k)$. The
results of full numerical calculations of all the physical processes we have
discussed can be encoded in the transfer function of a particular model (Bardeen et
al. 1986). For example, fast moving or 'hot' dark matter particles (HDM) erase
structure on small scales by the free-streaming effects mentioned above so that
$T(k) \rightarrow 0$ exponentially for large $k$; slow moving or 'cold' dark matter (CDM)
does not suffer such strong dissipation, but there is a kinematic suppression of
growth on small scales (to be more precise, on scales less than the horizon size at
matter-radiation equality); significant small-scale power nevertheless survives in
the latter case. These two alternatives thus furnish two very different scenarios
for the late stages of structure formation: the 'top-down' picture exemplified by
HDM first produces superclusters, which subsequently fragment to form galaxies;
CDM is a 'bottom-up' model because small-scale structures form first and
then merge to form larger ones. The general picture that emerges is that, while
the amplitude of each Fourier mode remains small, i.e. $|\tilde{\delta}(\mathbf{k})| \ll 1$, linear
theory applies. In this regime, each Fourier mode evolves independently and the
power-spectrum therefore just scales as

$$ P(k, t) = P(k, t_1)\,\frac{D_+^2(t)}{D_+^2(t_1)}. \qquad (19) $$

For scales larger than the Jeans length, this means that the shape of the
power-spectrum is preserved during linear evolution.

Figure 1. Examples of adiabatic transfer functions for baryons, hot
dark matter (HDM), cold dark matter (CDM) and mixed dark matter
(MDM; also known as CHDM). Isocurvature modes are also shown.
Picture courtesy of John Peacock.
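
To make the role of the transfer function concrete, the sketch below uses the cold dark matter fitting formula published by Bardeen et al. (1986) to turn a power-law primordial spectrum into a processed one, and then applies the linear scaling by the square of the growing mode. The choice of this particular fitting formula, the shape parameter `gamma` and its default value, and the function names are all assumptions made for illustration; the text does not commit to a specific parametrisation.

```python
import math


def bbks_transfer(k, gamma=0.2):
    """Cold dark matter fitting formula of Bardeen et al. (1986).

    k is in h/Mpc; gamma is an assumed shape parameter setting the scale
    of the turnover (roughly Omega_0 * h when baryons are negligible).
    """
    q = k / gamma
    if q < 1e-8:
        return 1.0  # large-scale limit: fluctuations are unprocessed
    poly = 1.0 + 3.89 * q + (16.1 * q) ** 2 + (5.46 * q) ** 3 + (6.71 * q) ** 4
    return math.log(1.0 + 2.34 * q) / (2.34 * q) * poly ** -0.25


def processed_power(k, amplitude=1.0, n=1.0, gamma=0.2):
    """P(k) = P_0(k) T^2(k) with a power-law primordial spectrum P_0 = A k^n."""
    return amplitude * k ** n * bbks_transfer(k, gamma) ** 2


def linear_power_at(k, growth_ratio, **kwargs):
    """Scale the processed spectrum by the squared ratio of growing modes."""
    return processed_power(k, **kwargs) * growth_ratio ** 2
```

Because the growth factor multiplies every wavenumber equally, the ratio `linear_power_at(k, g) / processed_power(k)` is the same for all k, which is the statement that the shape of the spectrum is preserved in the linear regime.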

2.5. Beyond linear theory 

The linearised equations of motion provide an excellent description of gravita- 
tional instability at very early times when density fluctuations are still small 
($\delta \ll 1$). The linear regime of gravitational instability breaks down when $\delta$
becomes comparable to unity, marking the commencement of the quasi-linear (or
weakly non-linear) regime. During this regime the density contrast may remain
small ($\delta < 1$), but the phases of the Fourier components $\tilde{\delta}_k$ become
substantially different from their initial values resulting in the gradual development of a
non-Gaussian distribution function if the primordial density field was Gaussian.
In this regime the shape of the power-spectrum changes by virtue of a
complicated cross-talk between different wave-modes. Analytic methods are available
for this kind of problem, but the usual approach is to use N-body experiments
for strongly non-linear analyses (Davis et al. 1985; Jenkins et al. 1999).

Further into the non-linear regime, bound structures form. The baryonic 
content of these objects may then become important dynamically: hydrodynam- 
ical effects (e.g. shocks), star formation and heating and cooling of gas all come 
into play. The spatial distribution of galaxies may therefore be very different 
from the distribution of the (dark) matter, even on large scales. Cosmological
hydrodynamics codes have only just begun to model some of these processes in
detail, which is some measure of the difficulty of understanding the formation
of galaxies and clusters. In the front rank of theoretical efforts in this area
are the so-called semi-analytical models, which encode simple rules for the
formation of stars within a framework of merger trees that allows the
hierarchical nature of gravitational instability to be explicitly taken into
account (Baugh et al. 1998).

The usual approach is instead simply to assume that the point-like
distribution of galaxies, galaxy clusters or whatever,

$$ n(\mathbf{r}) = \sum_i \delta^D(\mathbf{r} - \mathbf{r}_i), \qquad (20) $$

bears a simple functional relationship to the underlying $\delta(\mathbf{r})$. An assumption
often invoked is that relative fluctuations in the object number counts and matter
density fluctuations are proportional to each other, at least within sufficiently
large volumes, according to the linear biasing prescription:

$$ \frac{\delta n(\mathbf{r})}{\bar{n}} = b\,\frac{\delta\rho(\mathbf{r})}{\bar{\rho}}, \qquad (21) $$

where $b$ is what is usually called the biasing parameter. Alternatives, which
are not equivalent, include the high-peak model (Kaiser 1984; Bardeen et al.
1986) and the various local bias models (Coles 1993). Non-local biases are
possible, but it is rather harder to construct such models (Bower et al. 1993).
If one is prepared to accept an ansatz of the form (21) then one can use linear
theory on large scales to relate galaxy clustering statistics to those of the density
fluctuations, e.g.

$$ P_{\rm gal}(k) = b^2 P(k). \qquad (22) $$

This approach is the one most frequently adopted in practice, but the community
is becoming increasingly aware of its severe limitations. A simple
parametrisation of this kind cannot hope to describe realistically the relationship
between galaxy formation and environment (Dekel & Lahav 1999).
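
A minimal sketch of what the linear bias ansatz implies for measured spectra follows. It assumes a single scale-independent bias parameter per population, and the function names are introduced here purely for illustration. One point it brings out is that the relative bias of two galaxy populations can be estimated from their spectra alone, without knowing the matter spectrum.

```python
import math


def galaxy_power(matter_power, bias):
    """Linear bias: each galaxy power-spectrum value is b^2 times the matter value."""
    return [bias * bias * p for p in matter_power]


def estimate_bias(galaxy_power_values, matter_power_values):
    """Estimate a scale-independent b from the mean ratio P_gal / P over the bins."""
    ratios = [pg / pm for pg, pm in zip(galaxy_power_values, matter_power_values)]
    return math.sqrt(sum(ratios) / len(ratios))


def relative_bias(power_a, power_b):
    """Ratio b_a / b_b of two galaxy populations tracing the same matter field."""
    return estimate_bias(power_a, power_b)
```

If the ratio of the two spectra varies significantly from bin to bin, a single bias parameter is not an adequate description, which is the limitation noted in the text.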

3. Large-scale Structure: Past and Present

3.1. Modelling

Models of structure formation involve many ingredients which interact in a
complicated way: (i) a background cosmology, basically a choice of $\Omega_0$, $H_0$ and $\Lambda$
if we are prepared to stick with the Robertson-Walker metric (1) and the
Einstein equations (3)-(5); (ii) an initial fluctuation spectrum, usually taken to
be a power-law, usually with $n = 1$; (iii) a choice of fluctuation mode,
usually adiabatic; (iv) a statistical distribution of fluctuations, usually Gaussian;
(v) a transfer function, which requires knowledge of the relevant proportions of
'hot', 'cold' and baryonic material as well as the number of relativistic particle
species; (vi) a 'machine' for handling non-linear evolution, so that the
distribution of galaxies and other structures can be predicted, usually an N-body code,
an approximated dynamical calculation or simply, with fingers crossed, linear
theory; (vii) a prescription for relating fluctuations in mass to fluctuations in
light, frequently the linear bias model. I will now discuss how the attitude to
these ingredients has changed in the past, and is likely to in the near future.

Historically speaking, the first model incorporating non-baryonic dark mat- 
ter to be seriously considered was the hot dark matter (HDM) scenario, in which 
the universe is dominated by a massive neutrino with mass around 10-30 eV. 
This scenario has fallen into disrepute because the copious free streaming it pro- 
duces smooths the matter fluctuations on small scales and means that galaxies 
form very late. The favoured alternative for most of the 1980s was the cold dark 
matter (CDM) model in which the dark matter particles undergo negligible free 
streaming owing to their higher mass or non-thermal behaviour. A 'standard' 
CDM model (SCDM) then emerged in which the cosmological parameters were 
fixed at $\Omega_0 = 1$ and $h = 0.5$, the spectrum was of the Harrison-Zel'dovich
form with n = 1 and a significant bias, b = 1.5 to 2.5, was required to fit the 
observations (Davis et al. 1985). 

The SCDM model was ruled out by a combination of the COBE-inferred 
amplitude of primordial density fluctuations, galaxy clustering power-spectrum 
estimates on large scales, cluster abundances and small-scale velocity dispersions 
(Peacock & Dodds 1996). It seems the standard version of this theory simply has
a transfer function with the wrong shape to accommodate all the available data 
with an n = 1 initial spectrum. Nevertheless, because CDM is such a successful 
first approximation and seems to have gone a long way to providing an answer to 
the puzzle of structure formation, the response of the community has not been 
to abandon it entirely, but to seek ways of relaxing the constituent assumptions 
in order to get a better agreement with observations. Various possibilities have 
been suggested. 

If the total density is reduced to $\Omega_0 \simeq 0.3$, which is favoured by many
arguments, then the size of the horizon at matter-radiation equivalence increases
compared with SCDM and much more large-scale clustering is generated.
This is called the open cold dark matter model, or OCDM for short. Those
unwilling to dispense with the inflationary predilection for flat spatial sections
have invoked $\Omega_0 = 0.2$ and a positive cosmological constant to ensure that
$\kappa = 0$; this can be called ΛCDM and is also favoured by observations of distant
supernovae. Much the same effect on the power spectrum may also be obtained
in $\Omega = 1$ CDM models if matter-radiation equivalence is delayed, such as by the
addition of an extra relativistic particle species. The resulting models are
usually called τCDM.

Another alternative to SCDM involves a mixture of hot and cold dark matter
(CHDM), having perhaps $\Omega_{\rm hot} = 0.3$ for the fractional density contributed
by the hot particles. For a fixed large-scale normalisation, adding a hot
component has the effect of suppressing the power-spectrum amplitude at small
wavelengths. Another possibility is to invoke non-flat initial fluctuation
spectra, while keeping everything else in SCDM fixed. The resulting 'tilted' models,
TCDM, usually have $n < 1$ power-law spectra for extra large-scale power and,
perhaps, a significant fraction of tensor perturbations. Models have also been 
constructed in which non-power-law behaviour is invoked to produce the required 
extra power: these are the broken scale-invariance (BSI) models. 

3.2. Past Observational Developments 

In 1986, the CfA survey (de Lapparent, Geller & Huchra 1986) was the
'state-of-the-art', but this contained redshifts of only around 2000 galaxies with a
maximum recession velocity of 15 000 km s$^{-1}$. The subsequent Las Campanas
survey contained around six times as many galaxies, and goes out to a velocity
of 60 000 km s$^{-1}$ (Shectman et al. 1996). Quantitative measures of spatial
clustering obtained from these data sets offer the simplest method of probing $P(k)$,
assuming that these objects are related in some well-defined way to the mass
distribution and this, through the transfer function, is one way of constraining
cosmological parameters. For example, Peacock & Dodds (1996) made
compilations of power-spectra of different kinds of galaxy and cluster redshift samples.
Within the (considerable) observational errors, and the uncertainty introduced 
by modelling of the bias, all the data lie roughly on the same curve. A consistent 
picture thus emerged in which galaxy clustering extends over larger scales than 
is expected in the standard CDM scenario. It was difficult to say much in terms 
of testing the variations on the CDM theme I have discussed so far, however, 
because of the sparseness and limited scale coverage of the available data. 

3.3. The Present: Entering the Precision Era 

Prominent among the next generation of redshift surveys are the Sloan
Digital Sky Survey of about one million galaxy redshifts (Gunn & Weinberg
1995) and an Anglo-Australian collaboration using the two-degree field facility
(Colless et al. 2001). The latter survey, called 2dFGRS, has now finished taking
data while the Sloan Survey is still in progress. Both exploit multi-fibre methods 
that can obtain 400 galaxy spectra in one go, and will increase the number of 
redshifts by about two orders of magnitude over those previously available. The 
huge increase in survey depth (2dFGRS reaches redshifts z ~ 0.3) has allowed a 
much better measurement of the matter power-spectrum (Percival et al. 2001) 
and better statistics have allowed some progress to be made using higher-order 
statistical diagnostics of non-linearity and bias (Verde et al. 2002). 

It is evident from Figure 2 that, although the three non-SCDM models are 
similar at z = 0, differences between them are marked at higher redshift. This 
suggests the possibility of using measurements of galaxy clustering at high
redshift to distinguish between models and reality. This has now become possible,
with surveys of galaxies at $z \sim 3$ already being constructed (Steidel et al. 1998,
1999). Unfortunately, the interpretation of these new data is less
straightforward than one might have imagined. If the galaxy distribution is biased at $z = 0$
then the bias is expected to grow with $z$ (Davis et al. 1985). If galaxies are rare
peaks now, they should have been even rarer at high $z$. There are also many
distinct possibilities as to how the bias might evolve with redshift (Matarrese et
al. 1997; Moscardini et al. 1998; Coles et al. 1998).

But large-scale structure is not just about clustering power spectra. There 
are other ways in which it is possible to use information about the velocities 
of galaxies to constrain models (Strauss & Willick 1995). Probably the most 
useful information pertains to large-scale motions, as small-scale data populate 
the highly nonlinear regime. The basic principle is that velocities are induced 
by fluctuations in the total mass, not just the galaxies. Comparing measured
velocities with measured fluctuations in galaxy counts, it is possible to
constrain both $\Omega$ and $b$. From equations (10) to (13) it emerges that

$$ \mathbf{v} = -\frac{2f}{3\Omega H a}\nabla_x\phi + \frac{\rm const}{a(t)}, \qquad (23) $$

which demonstrates that the velocity flow associated with the growing mode in
the linear regime is curl-free, as it can be expressed as the gradient of a scalar
potential function. Notice also that the induced velocity depends on $\Omega$. This is
the basis of a method for estimating $\Omega$ which is known as POTENT. Since all
matter gravitates, not just the luminous material, there is a hope that methods
such as this can break the degeneracy between clustering induced by gravity
and that induced statistically, by bias. See Dekel (1994) for a review. These
methods are prone to error if there are errors in the velocity estimates. Perhaps
a more robust approach is to use peculiar motion information indirectly, through
the effect such motions have on the distribution of galaxies seen in redshift-space (i.e.
assuming total velocity is proportional to distance). The information gained
this way is statistical, but less prone to systematic error (Peacock et al. 2001)
and the evolution of the effect with redshift is also a test of cosmological models
(Ballinger, Peacock & Heavens 1996).
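
The way the density parameter and the bias enter a velocity-density comparison can be seen in a one-dimensional toy case. The sketch below takes a plane-wave galaxy overdensity, converts it to a mass overdensity with a linear bias, and applies the linear relation between the velocity divergence and the density contrast from Section 2.2. In this linear treatment the predicted velocity depends on the two parameters only through the ratio of the growth rate to the bias, so additional information is needed to separate them. The function names, the plane-wave set-up and the units (scale factor and Hubble parameter set to 1) are assumptions made for the example.

```python
import math


def growth_rate(omega):
    """Fitting formula f ~ Omega^0.6 quoted in Section 2.2."""
    return omega ** 0.6


def beta(omega, bias):
    """Combination f / b that linear theory links to galaxy-count fluctuations."""
    return growth_rate(omega) / bias


def plane_wave_velocity(x, k, delta_gal_amplitude, omega, bias, a=1.0, hubble=1.0):
    """Peculiar velocity for a galaxy overdensity delta_g = A cos(kx).

    Uses div v = -a H f delta with delta = delta_g / b,
    so that v = -a H beta A sin(kx) / k.
    """
    return -a * hubble * beta(omega, bias) * delta_gal_amplitude * math.sin(k * x) / k
```

For example, a model with a density parameter of 1 and no bias predicts the same velocities as one with a density parameter of 0.3 and a bias of about 0.49, since both give a ratio of exactly 1.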

Another class of observations that can help break the degeneracy between 
models involves gravitational lensing. The most spectacular forms of lensing 
are those producing multiple images or strong distortions in the form of arcs. 
These require very large concentrations of mass and are therefore not so useful 
for mapping the structure on large scales. However, there are lensing effects 
that are much weaker than the formation of multiple images. In particular, 
distortions producing a shearing of galaxy images promise much in this regard 
(Kaiser & Squires 1993). With the advent of new large CCD detectors, this 
should soon be realised (Mellier 1999). 

The combination of lensing, peculiar motions and galaxy clustering studies 
would be impressive enough even without the dramatic arrival of WMAP on the 
scene (Bennett et al. 2003). The WMAP data have really heralded the precision
era, allowing direct determinations of the primordial fluctuation spectrum and
the basic cosmological parameters in a manner that bypasses most of the important
sources of uncertainty in clustering analysis.

The bumps and wiggles shown in the transfer functions of Figure 1 do
find their way into the present-day spectrum of galaxy clustering, but they are
strongly affected by non-linear evolution on the way. Moreover, galaxy surveys
probe the distribution of luminous matter, so one cannot infer the matter
spectrum directly from that of galaxies without a model for the bias. Galaxies also
have peculiar motions so their redshifts do not exactly represent their proper
distances. Survey determinations of $P(k)$ will inevitably be harder to interpret
than those obtained from the cosmic microwave background, where none of these
complications arise (Hinshaw et al. 2003). This is the reason for the tremendous
precision of WMAP's determination of cosmological parameters (Spergel et al. 
2003), which can be improved still further by combining constraints from the 
2dFGRS and lensing studies. Nevertheless, there are very strong possibilities 
that a redshift survey performed with the SKA could probe both the spectrum 
and the background cosmology, for example by using the 'wiggles' as standard 
rulers (e.g. Blake & Glazebrook 2003). 



4. The Way Ahead: A Role for SKA Redshift Surveys? 

WMAP, 2dFGRS and the other manifestations of precision cosmology have
certainly made great strides towards the determination of the cosmological
parameters. A standard model has emerged which is very similar to the ΛCDM
model described above. Although it would be premature to say that no
departures from this model are possible, the emphasis as far as galaxy clustering is
concerned will be away from its use as a probe of the background cosmology. So
what is the future? And is there a role for the SKA in cosmological studies other
than consistency checks of the standard model? The answer to both questions
is emphatically "yes".

One can see evidence of a new direction already. Some of the most inter- 
esting results to have emerged from 2dFGRS concern the clustering of galaxies 
selected by spectral type (Madgwick et al. 2003). Preliminary results from 
the Sloan Digital Sky Survey reveal a complicated dependence of clustering on the
colours of selected galaxies (Zehavi et al. 2002). There is evidence of clustering
dependence on intrinsic galaxy properties emerging also from infra-red selected
galaxies (Hawkins et al. 2001). While these dependencies are simply a
nuisance when it comes to determining cosmological parameters, they indicate that
the large-scale distribution of galaxies may hold clues to their formation
process. The relative clustering strength of different populations may be
complex and scale-dependent, requiring a more sophisticated description than the
simple bias parameters described above.

In principle observations such as these can be used to test semi-analytic 
models of galaxy formation of the form discussed by Baugh et al. (1998). On 
the other hand, all the classes of galaxy mentioned are selected by radiation 
coming from sources with a complex and poorly understood formation process. 
The Square Kilometre Array could produce a great step forward in this area, by 
mapping galaxy positions and redshifts in neutral hydrogen via the 21cm line. 
Detailed theoretical predictions are so far lacking, but two "straw man" surveys 
are described in the SKA science case. 

4.1. The SKA Shallow Survey

The first case is a "traditional" redshift survey along the lines of 2dFGRS but 
using 21cm to select the galaxies. Depending on the eventual choice of instru- 
mental sensitivity, such a survey might take 12 months, cover about 1000 square 
degrees of sky and be capable of detecting galaxies out to z ~ 2; compare the 
limit z ~ 0.3 of 2dFGRS. All in all, this means a survey of around ~ 10^ galaxies 
in a volume of order 10^ Mpc. This is impressive enough in itself, but such a 
survey would also bring with it the possibility of HI Tully-Fisher measurements
for the galaxies in it. In this respect its nearest present relative is the 6dF galaxy
survey described at:

http://www.mso.anu.edu.au/6dFGS/6dF_survey_plan.htm

The potential to combine redshifts and Tully-Fisher distances enables
velocity field mapping on an immense scale.

4.2. The SKA Pencil Beam Survey 

An alternative mode of redshift survey for SKA is to look at a smaller area for
much longer. Using the same sensitivity as in the previous example of a shallow
survey, a 360 hour survey covering one square degree could contain 10^ galaxies.
A present-day L* galaxy could be detected in its HI emission out to a redshift
$z \simeq 3$. The limiting HI mass would be a few times 10^ $M_\odot$ at z ~ 4 and of order
10^ $M_\odot$ at z ~ 1.
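
For orientation, the redshifts quoted for such surveys translate directly into observing frequencies, since the 21cm line is emitted at a fixed rest frequency and received at that frequency divided by (1 + z). The short sketch below does the conversion; the constant and function names are introduced here for illustration only.

```python
HI_REST_FREQUENCY_MHZ = 1420.405751768  # rest frequency of the 21cm hyperfine line


def observed_frequency_mhz(z):
    """Frequency at which HI emitted at redshift z is received, nu_rest / (1 + z)."""
    return HI_REST_FREQUENCY_MHZ / (1.0 + z)


def redshift_from_frequency(nu_obs_mhz):
    """Invert the relation: redshift of an HI line detected at nu_obs_mhz."""
    return HI_REST_FREQUENCY_MHZ / nu_obs_mhz - 1.0


for z in (0.3, 1.0, 3.0, 4.0):
    print(f"z = {z}: {observed_frequency_mhz(z):.1f} MHz")
```

Redshifts between 1 and 4 therefore correspond to observing frequencies between roughly 710 MHz and 284 MHz.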

The possibility of detecting objects at high redshift offers the prospect of 
constraining models of galaxy formation extremely strongly. In all hierarchical 
clustering models, the bias associated with galaxies increases dramatically with 
redshift. This results in a strange conspiracy: the matter correlations decrease
with increasing redshift while the bias increases in compensation, producing a very
slow evolution of measured clustering with epoch. However, the probes we have
of high-redshift clustering, such as Lyman-break galaxies (Steidel et al. 1998,
1999) and QSOs (Outram et al. 2001), suffer from low sampling density and
uncertain interpretation of the host object. More importantly, the supply of cold 
gas plays a central role in the detailed semi-analytic models of galaxy formation 
and the evolution of HI mass function with redshift will be a decisive test of the 
basic framework. However, much theoretical work is needed to make detailed 
predictions for such surveys. The hierarchical nature of structure formation 
involves gas being distributed in less massive haloes at high redshift, but gas 
is also used up to form stars as time goes on. The number of HI sources seen
as a function of redshift may be drastically different from that expected of a 
non-evolving population of present-day galaxies. 
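
The compensation between falling matter correlations and rising bias can be illustrated with a toy calculation. The sketch below assumes a flat matter-dominated background, so that the growing mode scales as 1/(1 + z), and adopts one simple bias-evolution law, b(z) = 1 + (b_0 - 1)/D(z), appropriate to tracers that are conserved after formation. This law is only one of the many possibilities mentioned earlier and is an assumption of the example, not a prediction taken from the text; the function names are likewise hypothetical.

```python
def growth_factor(z):
    """Growing mode for Omega = 1, normalised to D_+ = 1 at z = 0."""
    return 1.0 / (1.0 + z)


def conserved_tracer_bias(z, bias_today):
    """Assumed toy model: b(z) = 1 + (b_0 - 1) / D_+(z)."""
    return 1.0 + (bias_today - 1.0) / growth_factor(z)


def clustering_ratio(z, bias_today):
    """Galaxy and matter power at redshift z relative to z = 0, in linear theory."""
    matter = growth_factor(z) ** 2
    galaxies = (conserved_tracer_bias(z, bias_today) * growth_factor(z)) ** 2
    return galaxies / bias_today ** 2, matter
```

With a present-day bias of 2, the matter power at z = 3 is 1/16 of its present value in this toy model, whereas the galaxy power falls only to about 0.39 of its present value.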

5. Discussion and Conclusions 

I have emphasized the importance of clustering properties and their implications 
for galaxy and large-scale structure formation. There will no doubt be many that 
disagree with this emphasis. Large-scale matter power spectrum determination
will be possible using SKA and will be enormously better even than 2dFGRS
or Sloan. Such studies are well worth doing, as are the numerous possible tests
of departures from the standard model, especially with respect to the possible
forms of dark energy such as quintessence (Huterer & Turner 2001). A 21cm
survey would be better fitted to such a task than QSO surveys (e.g. Outram et
al. 2001) because of the higher sampling density.

Interesting though the results of such studies will be, they will almost cer- 
tainly turn out merely to provide consistency checks on a cosmology largely fixed 
by studies of the cosmic microwave background. For me, the distribution
of cold gas on large scales, how it relates to stellar populations of various kinds
and how the supply of this gas has evolved with cosmic epoch offer the
richest scientific possibilities. What is now needed is proper theoretical modelling of
HI-selected galaxies to produce mock catalogues to drive the science case further
forward. Watch this space. 

Acknowledgments. This style file is based on one provided by PASP. I 
thank the editors for their patience! 